This section covers the modularization and configuration of Nextflow scripts.
Creating an nf-core pipeline
A new pipeline based on the nf-core template can be created with the nf-core create command.
Example:
nf-core create -n testpipeline -d "Test pipeline" -a "Diya" --plain
Output:
tree -d nf-core-testpipeline/
nf-core-testpipeline/
├── assets
├── bin
├── conf
├── docs
│   └── images
├── lib
├── modules
│   ├── local
│   └── nf-core
│       ├── custom
│       │   └── dumpsoftwareversions
│       │       └── templates
│       ├── fastqc
│       └── multiqc
├── subworkflows
│   └── local
└── workflows
17 directories
Modules
Stand-alone module scripts can be included and shared across multiple workflows. Each module can contain its own process or workflow definition.
Example:
cat bin/nf-modules/modules/cutadapt.nf
#!/usr/bin/env nextflow

process CUTADAPT {
    // Use the Singularity image under Singularity, otherwise the Docker image
    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
        'https://depot.galaxyproject.org/singularity/cutadapt:3.4--py39h38f01e4_1' :
        'biocontainers/cutadapt:3.4--py39h38f01e4_1' }"

    input:
    tuple val(meta), path(reads)

    output:
    tuple val(meta), path('*.trim.fastq.gz'), emit: reads
    tuple val(meta), path('*.log')          , emit: log
    path "versions.yml"                     , emit: versions

    when:
    task.ext.when == null || task.ext.when

    script:
    // Extra cutadapt options are supplied through the configuration (ext.args)
    def args = task.ext.args ?: ''
    def prefix = meta
    // One output file for single-end reads, two for paired-end reads
    def trimmed = params.single_end ? "-o ${prefix}.trim.fastq.gz" : "-o ${prefix}_1.trim.fastq.gz -p ${prefix}_2.trim.fastq.gz"
    """
    cutadapt \\
        $args \\
        $trimmed \\
        $reads \\
        > ${prefix}.cutadapt.log

    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        cutadapt: \$(cutadapt --version)
    END_VERSIONS
    """

    stub:
    // Creates empty placeholder outputs instead of running the trimming
    def prefix = task.ext.prefix ?: meta
    def trimmed = params.single_end ? "${prefix}.trim.fastq.gz" : "${prefix}_1.trim.fastq.gz ${prefix}_2.trim.fastq.gz"
    """
    touch ${prefix}.cutadapt.log
    touch ${trimmed}

    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        cutadapt: \$(cutadapt --version)
    END_VERSIONS
    """
}
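The module above reads task.ext.args rather than hard-coding cutadapt options, so the options can be set from configuration. The fragment below is a minimal sketch of how that might look. The file name conf/modules.config, the adapter sequence and the length cutoff are illustrative assumptions, not values taken from this pipeline.

```nextflow
// conf/modules.config (hypothetical file name)
process {
    // Applies only to the process named CUTADAPT
    withName: 'CUTADAPT' {
        // Passed to the module as task.ext.args; the adapter and cutoff are example values
        ext.args = '-a AGATCGGAAGAGC --minimum-length 20'
    }
}
```

A file like this would typically be loaded from nextflow.config with an includeConfig statement, or passed on the command line with the -c option.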
Importing Modules
Components defined in the module script can be imported into other Nextflow scripts using the include statement.
This makes it possible to store these components in separate files so that they can be reused in multiple workflows.
Example:
include { BOWTIE2_BUILD } from '/home/diya/nf-modules/modules/bowtie2_build'
Module aliases
When including a module component, it is possible to specify a name alias using the as keyword.
This allows the same component to be included and invoked multiple times under different names.
Example:
include { BOWTIE2_ALIGN as BOWTIE2_ALIGN_CUTADAPT } from '/home/diya/nf-modules/modules/bowtie2_align'
include { BOWTIE2_ALIGN as BOWTIE2_ALIGN_TRIMMOMATIC } from '/home/diya/nf-modules/modules/bowtie2_align'
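Aliases matter because a process can be invoked only once under a given name within a workflow. The sketch below shows how the two aliases might be called side by side. The workflow name ALIGN_BOTH, the channel names, and the assumption that BOWTIE2_ALIGN takes a reads tuple and an index are all hypothetical, since the module's inputs are not shown here.

```nextflow
include { BOWTIE2_ALIGN as BOWTIE2_ALIGN_CUTADAPT    } from '/home/diya/nf-modules/modules/bowtie2_align'
include { BOWTIE2_ALIGN as BOWTIE2_ALIGN_TRIMMOMATIC } from '/home/diya/nf-modules/modules/bowtie2_align'

// Hypothetical subworkflow: names and input shapes are assumptions
workflow ALIGN_BOTH {
    take:
    cutadapt_reads     // channel: [ meta, reads ] trimmed with cutadapt
    trimmomatic_reads  // channel: [ meta, reads ] trimmed with Trimmomatic
    index              // Bowtie 2 index (assumed second input of the module)

    main:
    // Each alias is a separate process instance, so both calls are allowed
    BOWTIE2_ALIGN_CUTADAPT(cutadapt_reads, index)
    BOWTIE2_ALIGN_TRIMMOMATIC(trimmomatic_reads, index)
}
```

Calling BOWTIE2_ALIGN twice without aliases would fail, because the second call reuses a process name that has already been invoked.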
Main script
The main script, typically main.nf, is run with the nextflow run command to execute the whole pipeline. The workflow is imported and invoked here. When the main script runs, it invokes the workflow, which in turn invokes the modules and subworkflows, and their processes are executed.
Example:
cat bin/nf-modules/main.nf
#!/usr/bin/env nextflow

nextflow.enable.dsl=2

// Include workflow
include { TRIMALIGN } from "/home/diya/nf-modules/workflows/trimalign"

workflow {
    TRIMALIGN()
}
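The contents of workflows/trimalign.nf are not shown here. As a rough illustration of what such a workflow file could contain, the sketch below wires the CUTADAPT module to an input channel. The include path (chosen by analogy with the other modules), the params.reads parameter, and the omitted alignment steps are assumptions, not the actual pipeline.

```nextflow
// workflows/trimalign.nf (hypothetical contents)
include { CUTADAPT } from '/home/diya/nf-modules/modules/cutadapt'

workflow TRIMALIGN {
    // Emits [ sample_id, [ read files ] ], matching `tuple val(meta), path(reads)`
    // params.reads is an assumed parameter, e.g. a glob such as 'data/*_{1,2}.fastq.gz'
    reads_ch = Channel.fromFilePairs(
        params.reads,
        size: params.single_end ? 1 : 2,
        checkIfExists: true
    )

    CUTADAPT(reads_ch)

    // Alignment steps would follow here, consuming CUTADAPT.out.reads
}
```

With a file like this in place, the pipeline could be started with nextflow run main.nf --reads 'data/*_{1,2}.fastq.gz'. Adding the -stub-run option makes Nextflow execute the stub blocks of the modules instead of the real commands, which is useful for checking the wiring quickly.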